Dirk Eddelbuettel: R and Big Data at Big Data Summit at UI Research Park
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
DTRT with it (e.g. dashes turn to underscores, earmuffs turn to all-caps).
true_division
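The renaming rule mentioned above (dashes become underscores, earmuffs become all-caps) can be pictured with a small Python sketch. This is only an illustration of those two rules; the function name `mangle` is hypothetical and this is not Hy's actual implementation, which handles more cases.

```python
def mangle(symbol: str) -> str:
    """Illustrative only: apply the two renaming rules named in the text."""
    # Earmuffs: a name wrapped in asterisks, *like-this*, becomes upper case.
    if len(symbol) > 2 and symbol.startswith("*") and symbol.endswith("*"):
        symbol = symbol[1:-1].upper()
    # Dashes are not valid in Python identifiers, so they become underscores.
    return symbol.replace("-", "_")


print(mangle("true-division"))   # true_division
print(mangle("*debug-level*"))   # DEBUG_LEVEL
```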
;; good:
(with [fd (open "/etc/passwd")]
  (print (.readlines fd)))
;; bad:
(with [fd (open "/etc/passwd")]
  (print (fd.readlines)))
threading macro
throughout code
where it makes sense.
;; good:
(import [sh [cat grep]])
(-> (cat "/usr/share/dict/words") (grep "-E" "tag$"))
;; bad:
(import [sh [cat grep]])
(grep (cat "/usr/share/dict/words") "-E" "tag$")
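For readers more at home in Python, the idea behind the threading macro can be sketched as a helper that feeds a value in as the first argument of each call in turn. The helper name `thread_first` and the tuple-based form encoding are assumptions made for this illustration; Hy's macro does the rewriting at compile time instead.

```python
from functools import reduce


def thread_first(value, *forms):
    """Pass `value` as the first argument of each form, left to right.

    Each form is a tuple: (function, extra_arg, ...).
    """
    def step(acc, form):
        func, *extra = form
        return func(acc, *extra)

    return reduce(step, forms, value)


# Reads top to bottom, instead of inside out as len(str.split(str.strip(...), " ")).
result = thread_first(
    "  some words  ",
    (str.strip,),
    (str.split, " "),
    (len,),
)
print(result)  # 2
```

The nested version has to be read from the innermost call outwards; the threaded version lists the steps in the order they run, which is the point of the "good" example above.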
;; good (and preferred):
(defn fib [n]
  (if (<= n 2)
    n
    (+ (fib (- n 1)) (fib (- n 2)))))
;; still OK:
(defn fib [n]
  (if (<= n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
;; still OK:
(defn fib [n]
  (if (<= n 2)
    n
    (+ (fib (- n 1)) (fib (- n 2)))))
;; Stupid as hell
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
;; good (and preferred):
(defn fib [n]
  (if (<= n 2)
    n
    (+ (fib (- n 1)) (fib (- n 2)))))
;; Stupid as hell
(defn fib [n]
(if (<= n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))
)
) ; GAH, BURN IT WITH FIRE
;; bad (and evil)
(defn foo (x) (print x))
(foo 1)
;; good (and preferred):
(defn foo [x] (print x))
(foo 1)
svn-all-fast-export --identity-map authors.txt --rules pkg-ace.rules svn-pkg-ace

Here's the content of the pkg-ace.rules configuration file that was used:

create repository pkg-ace
end repository

match /trunk/
  repository pkg-ace
  branch master
end match

match /(branches|tags)/([^/]+)/
  repository pkg-ace
  branch \2
end match

The author mapping file authors.txt being:

markos = Konstantinos Margaritis <email-hidden>
mbrudka-guest = Marek Brudka <email-hidden>
pgquiles-guest = Pau Garcia i Quiles <email-hidden>
tgg = Thomas Girard <email-hidden>
tgg-guest = Thomas Girard <email-hidden>

The tool's sample configuration file merged-branches-tags.rules recommends post-processing tags, which are just branches in SVN. That's why the configuration file above treats tags as branches. The conversion was indeed fast: less than 1 minute.
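To see what the second match rule does, here is a small Python sketch that applies the same regular expression to a few SVN paths. It assumes the pattern is the alternation `/(branches|tags)/([^/]+)/` with `\2` naming the target branch; the function name `target_branch` and the file names in the sample paths are made up for illustration, and the real tool's matching has more to it than this.

```python
import re
from typing import Optional

# Pattern from the second match rule; the branch name is capture group 2.
RULE = re.compile(r"/(branches|tags)/([^/]+)/")


def target_branch(svn_path: str) -> Optional[str]:
    """Return the git branch a path would land on under the rules above."""
    if svn_path.startswith("/trunk/"):
        return "master"
    m = RULE.match(svn_path)
    return m.group(2) if m else None


for path in ("/trunk/debian/changelog",
             "/tags/5.4.7-5/debian/changelog",
             "/branches/5.4.7-12/debian/rules"):
    print(path, "->", target_branch(path))
```

Both `/tags/5.4.7-5/...` and `/branches/5.4.7-12/...` end up as git branches named after the second path component, which is why SVN tags show up as branches after the conversion.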
svn tags as branches: Branches are marked with green rectangles, and tags with yellow arrows. What we have here (expected given our configuration of the tool) are branches (e.g. 5.4.7-5) corresponding to tags, and tags matching the SVN tagging commit (e.g. backups/5.4.7-5@224). We'll review and fix this.
merged code that did not appear as such: Branches that were not merged using svn merge look like they were not merged at all.
commits with wrong author: Before being in SVN, the repository was stored in CVS. When it was imported into SVN, no special attention was given to the commit author. Hence I got credited for changes I did not write.
obsolete branches: The tool keeps all branches, including removed ones (with a tag at their end), so that you can decide what to do with them.
missing merges: The branch 5.4.7-12 was never merged into the trunk!
#!/usr/bin/env ruby
#
# retag.rb
#
# Small script to create an annotated tag, specifying committer as well as
# date, and tag comment.
#
# Based on Scott Chacon "Custom Importer" example.
#
# Arguments:
#  $1 -- tag name
#  $2 -- sha-1 revision to tag
#  $3 -- committer in the form First Last <email>
#  $4 -- date to use in the form YYYY/MM/DD_HH:MM:SS
#  $5 -- tag comment

def help
  puts "Usage: retag <tag> <sha1sum> <committer> <date> <comment>"
  puts "Creates an annotated tag with name <tag> for commit <sha1sum>, using "
  puts "given <committer>, <date> and <comment>"
  puts "The output should be piped to git fast-import"
end

# Convert YYYY/MM/DD_HH:MM:SS (local time) to seconds since the epoch.
def to_date(datetime)
  (date, time) = datetime.split('_')
  (year, month, day) = date.split('/')
  (hour, minute, second) = time.split(':')
  return Time.local(year, month, day, hour, minute, second).to_i
end

# Print a "tag" command in git fast-import format.
def generate_tag(tag, sha1hash, committer, date, message)
  puts "tag #{tag}"
  puts "from #{sha1hash}"
  puts "tagger #{committer} #{date} +0000"
  # fast-import expects the length of the message in bytes.
  print "data #{message.bytesize}\n#{message}"
end

if ARGV.length != 5
  help
  exit 1
else
  (tag, sha1sum, committer, date, message) = ARGV
  generate_tag(tag, sha1sum, committer, to_date(date), message)
end
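As a cross-check of what the script feeds to git fast-import, here is a rough Python equivalent that builds the same tag stanza. It is a sketch, not a port: the names `to_epoch` and `tag_stanza` are hypothetical, the tagger and SHA-1 in the usage lines are placeholders, and it interprets the date as UTC where the Ruby script uses local time.

```python
from datetime import datetime, timezone


def to_epoch(stamp: str) -> int:
    """Convert YYYY/MM/DD_HH:MM:SS to epoch seconds (assumption: stamp is UTC)."""
    dt = datetime.strptime(stamp, "%Y/%m/%d_%H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())


def tag_stanza(tag: str, sha1: str, tagger: str, stamp: str, message: str) -> bytes:
    """Build a fast-import 'tag' command; the data length is counted in bytes."""
    body = message.encode("utf-8")
    header = (
        f"tag {tag}\n"
        f"from {sha1}\n"
        f"tagger {tagger} {to_epoch(stamp)} +0000\n"
        f"data {len(body)}\n"
    )
    return header.encode("utf-8") + body


# Placeholder values, for illustration only.
stanza = tag_stanza("5.4.7-5", "0" * 40, "Jane Doe <jane@example.org>",
                    "1970/01/01_00:00:10", "Tag 5.4.7-5")
print(stanza.decode("utf-8"))
```

Counting the encoded bytes rather than characters matters as soon as a tag comment contains non-ASCII text.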
me@mymachine$ echo 6a6d48814d0746fa4c9f6869bd8d5c3bc3af8242 11cf74d4aa996ffed7c07157fe0780ec2224c73e 898ad49b61d4d8d5dc4072351037e2c8ade1ab68 >> .git/info/grafts
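A graft line lists a commit followed by the parents git should pretend it has, so the line above records one commit with two parents, i.e. a merge. The following Python sketch only splits such a line to make that structure visible; the function name `parse_graft` is made up for this example.

```python
def parse_graft(line: str):
    """Split a .git/info/grafts line into (commit, [parents])."""
    commit, *parents = line.split()
    return commit, parents


commit, parents = parse_graft(
    "6a6d48814d0746fa4c9f6869bd8d5c3bc3af8242 "
    "11cf74d4aa996ffed7c07157fe0780ec2224c73e "
    "898ad49b61d4d8d5dc4072351037e2c8ade1ab68"
)
print("commit: ", commit)
print("parents:", parents)
```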
#!/bin/sh

br="HEAD"

TARG_NAME="Raphael Bossek"
TARG_EMAIL="hidden"
export TARG_NAME TARG_EMAIL

filt='
  if test "$GIT_COMMIT" = 546db1966133737930350a098057c4d563b1acdf -o \
          "$GIT_COMMIT" = 23419dde50662852cfbd2edde9468beb29a9ddcc; then
    if test -n "$TARG_EMAIL"; then
      GIT_AUTHOR_EMAIL="$TARG_EMAIL"
      export GIT_AUTHOR_EMAIL
    else
      unset GIT_AUTHOR_EMAIL
    fi
    if test -n "$TARG_NAME"; then
      GIT_AUTHOR_NAME="$TARG_NAME"
      export GIT_AUTHOR_NAME
    else
      unset GIT_AUTHOR_NAME
    fi
  fi
'

git filter-branch $force --tag-name-filter cat --env-filter "$filt" -- $br

(Script edited here; there were many more commits written by Raphael.)
Important
It's important to realize that the whole history of the selected branch is rewritten, so all object IDs will change. You should not do this if you have already published your repository.
Hint
Once git filter-branch completes you get a new history, as well as a new original ref to ease comparison. It is highly recommended to check the result of the rewrite before removing original. To shrink the repo after this, git clone the rewritten repo with file:// syntax -- git-filter-branch says it all.
Add graft points where needed.
Clean tags and branches. Using git tag -d, git branch -d and the Ruby script above, it was possible to recreate tags. During this I was also able to add missing tags and fix some SVN mistakes I had made, like committing in a branch created under tags/.
Remove obsolete branches.
Merge missing pieces. There were just two missing debian/changelog entries. I did this before git filter-branch because I did not find a way to use the tool correctly with multiple heads.
Fix commit author where needed. Using the shell script above, Raphael is now correctly credited for his work.
[1] http://lists.alioth.debian.org/pipermail/pkg-ace-devel/2011-March/002421.html
[2] available in Debian as svn-all-fast-export
kdeinit4: preparing to launch /usr/bin/knotify4
At the user/password prompt of kdm I can log in, the KDE splash screen appears and then, suddenly, the connection fails and I'm back at the kdm login again. I tried to look for existing bug reports, but KDE is quite large, with many programs. Are there any pointers for a bug report or even a solution/fix for the problem, dear LazyWeb?
UPDATE 21:51:
Connecting to deprecated signal QDBusConnectionInterface::serviceOwnerChanged(QString,QString,QString)
knotify(16474) KNotify::event: 1 ref= 0
QMetaObject::invokeMethod: No such method KUniqueApplication::loadCommandLineOptionsForNewInstance()
kdeinit4: preparing to launch /usr/bin/plasma-desktop
kded4: Fatal IO error: client killed
kdeinit4: Fatal IO error: client killed
kdeinit4: sending SIGHUP to children.
klauncher: Exiting on signal 1
/join #debian-miniconf-berlin
# get the packages
corsac@hidalgo: sudo aptitude -R install cdebootstrap chroot
# create the i386 chroot:
corsac@hidalgo: sudo cdebootstrap -a i386 sid chroot http://ftp.fr.debian.org/debian
# bind-mount /tmp for Xorg in the chroot:
corsac@hidalgo: sudo mount -o bind /tmp debian/chroot/tmp
# Entering chroot:
corsac@hidalgo: sudo chroot debian/chroot /bin/bash
# configure apt:
root: echo "deb http://ftp.fr.debian.org/debian/ sid main contrib non-free" > /etc/apt/sources.list
root: aptitude update
# installing packages:
root: aptitude -R install iceweasel sun-java5-plugin
# no need to run iceweasel as root:
root: adduser corsac
# done for inside the chroot
We now need two things in the chroot:
corsac@hidalgo: cp ~/.Xauthority debian/chroot/home/corsac/
Time to re-enter the chroot:
corsac@hidalgo: sudo chroot debian/chroot /bin/bash
root: su - corsac
# we export the DISPLAY to use host X
corsac: export DISPLAY=:0.0
# ready to go
corsac: iceweasel
Now you should be able to go to the website and declare. No need
for a symbolic link or an LD_LIBRARY_PATH hack. I did not manage to
use sun-java6-plugin.
Don't hesitate to purge your chroot and restart from a clean
one. You can also clean the folder where the shared lib is stored,
in ~/.TaoUSign.
Hope that helps.
usb 1-1: new full speed USB device using uhci_hcd and address 8
usb 1-1: configuration #1 chosen from 1 choice
drivers/usb/serial/usb-serial.c: USB Serial support registered for FTDI USB Serial Device
ftdi_sio 1-1:1.0: FTDI USB Serial Device converter detected
drivers/usb/serial/ftdi_sio.c: Detected FT232BM
usb 1-1: FTDI USB Serial Device converter now attached to ttyUSB0
usbcore: registered new driver ftdi_sio
drivers/usb/serial/ftdi_sio.c: v1.4.3:USB FTDI Serial Converters Driver
I then fooled around with minicom, and discovered that the little transistor thing I'd been ignoring was indeed the temperature sensor, as I got a reading of zero back (when using this program I found on the 'net).
So I went to bed, and this morning did a bit of messing around with the sensor, and with a bit of creative bending, I've got it sitting in the S1 holes without requiring any soldering. It tells me the linen cupboard is about 44 degrees Celsius. Warm, but I don't think it's in any immediate danger of bursting into flames. Wouldn't surprise me if some of the gear in there isn't too keen about the temperature though. At least we won't have to worry about mold.
Next step is to convince cacti to graph it, and nagios to monitor it, and we're in business. Here's a little Python program I knocked up to grab the temperature. pyserial is nice. Read on, Macduff!